A skip-list approach for efficiently processing forecasting queries

Authors

Tingjian Ge

Stanley B. Zdonik

Abstract

Time series data is common in many settings, including scientific and financial applications, where the amount of data is often very large. We seek to support prediction queries over time series data. Prediction relies on model building, which can be too expensive to be practical if it is based on a large number of data points. We propose to use statistical hypothesis tests to choose a proper subset of data points to use for a given prediction query interval. This involves two steps: choosing a proper history length and choosing the number of data points to use within this history. Further, we use an I/O-conscious skip list data structure to provide samples of the original data set. Based on the statistics collected for a query workload, which we model as a probability mass function (PMF) over query intervals, we devise a randomized algorithm that selects a set of pre-built models (PMs) to construct, subject to a maintenance cost constraint when there are updates. Given this set of PMs, we discuss query processing strategies not only for point queries but also for range, aggregation, and JOIN queries. We conduct a comprehensive empirical study on real-world datasets to verify the effectiveness of our approaches and algorithms.
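As a rough illustration of how skip-list levels can serve as progressively sparser samples of a time series, the following Python sketch may help. It is not the paper's algorithm: the function names (`build_levels`, `sample_history`, `fit_line`, `forecast`) are hypothetical, the history window and point budget are passed in by hand rather than chosen by hypothesis tests, the linear model is an assumed stand-in for whatever model a query would use, and nothing here is I/O-conscious.

```python
import bisect
import random


def build_levels(n_points, p=0.5, max_levels=16, seed=0):
    """Build skip-list style levels over time indices 0..n_points-1.

    Level 0 holds every index; each higher level keeps each index of the
    level below with probability p, so level k is roughly a p**k sample.
    """
    rng = random.Random(seed)
    levels = [list(range(n_points))]
    while len(levels) < max_levels and len(levels[-1]) > 1:
        promoted = [i for i in levels[-1] if rng.random() < p]
        if not promoted:
            break
        levels.append(promoted)
    return levels


def sample_history(levels, start, end, max_points):
    """Return indices in [start, end) from the densest level that fits the budget.

    If no level fits, the window from the sparsest level is returned.
    """
    window = []
    for level in levels:
        lo = bisect.bisect_left(level, start)
        hi = bisect.bisect_left(level, end)
        window = level[lo:hi]
        if len(window) <= max_points:
            break
    return window


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Requires at least two distinct x values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(series, levels, start, end, max_points, t_future):
    """Fit a line to a sampled history window and extrapolate to t_future."""
    idx = sample_history(levels, start, end, max_points)
    slope, intercept = fit_line(idx, [series[i] for i in idx])
    return slope * t_future + intercept
```

A call such as `forecast(series, build_levels(len(series)), 500, 1000, 50, 1010)` would fit the model on at most 50 sampled points from the last 500 observations instead of on all of them, which is the cost saving the abstract is after.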


Similar resources

I/O Efficient Search of Large Social Networks

We introduce an I/O efficient algorithm and data structure to support fast decentralized search in large graphs modeling social networks. We structure network data in a homophily-based social hierarchy using an append-only, block-aligned skip list with an embedded tree microindex, which reduces I/O and cache line faults. We further minimize I/O when building the skip list by combining an extend...


Range queries over skip tree graphs

The support for complex queries, such as range, prefix and aggregation queries, over structured peer-to-peer systems is currently an active and significant topic of research. This paper demonstrates how Skip Tree Graph, as a novel structure, presents an efficient solution to that problem area through provision of a distributed search tree functionality on decentralised and dynamic environments....


Range-capable Distributed Hash Tables

In this paper, we present a novel indexing data structure called RDHT (Range capable Distributed Hash Table) derived from skip lists and specifically designed for storing and retrieving geographic data from a structured P2P network overlay. We have developed RDHTs as backend for the DART search engine, whose goal is to efficiently answer complex queries based on semantics and geographical conte...


A Service-oriented Scalable Dictionary in MPI

In this paper we present a distributed, in-memory, message passing implementation of a dynamic ordered dictionary structure. The structure is based on a distributed fine-grain implementation of a skip list that can scale across a cluster of multicore machines. We present a service-oriented approach to the design of distributed data structures in MPI where the skip list elements are active proce...


Simplified Self-Adapting Skip Lists

The Simplified Self-Adapting Skip List, a practical new extension of the Skip List data structure, is designed for use with data that exhibit bias, that is, a nonuniform distribution of queries to set elements. The structure observes an initially unknown degree of bias in queries to a data set and adapts itself to a consistently nearly-optimal configuration, improving search efficiency and spee...



Journal:

PVLDB

Volume 1

Publication date: 2008
